Digital authentication with digital and analog documents

ABSTRACT

Techniques for incorporating authentication information into digital representations of objects and using the authentication information to authenticate the objects. The authentication information may be made from information in one portion of the digital representation and incorporated into another portion of the digital representation that does not overlap the first portion. Where the digital representation is made into an analog form and that in turn is made into a digital representation and the second digital representation is verified, the two portions must further be non-overlapping in the analog form. The information from which the authentication information is made may exist at many levels: representations of physical effects produced by the object, representations of features of the object, codes that represent the object's contents, and representations of descriptions of the object. Also disclosed are a verification server and techniques for reducing errors by an OCR. The verification server verifies authenticated documents. When a document is verified, an identifier is associated with the document and the identifier is used to locate a key for the authentication information and in some cases a second copy of the authentication information. The verification process may also involve security patterns that are a physical part of the analog form. The error reduction techniques include an error code specifying characters in the object that are confusing to OCR devices, and the error code is used to correct the results of an OCR reading of an analog form.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS

[0001] The present application is a divisional of U.S. Ser. No. 10/136,172, which has the same inventor, title, and assignee as the present application and which is hereby incorporated into the present application by reference. U.S. Ser. No. 10/136,172 will issue as U.S. Pat. No. 6,751,336 on Jun. 15, 2004. U.S. Pat. No. 6,751,336 is in turn a divisional of U.S. Pat. No. 6,487,301, which in turn is a continuation-in-part of U.S. Pat. No. 6,243,480, Jian Zhao, et al., Digital authentication with analog documents, issued Jun. 5, 2001. The present application contains the complete discussion of digital authentication with analog documents from U.S. Pat. No. 6,243,480, and that patent is also incorporated herein by reference. The material that was added in the Continuation-in-part begins with the section titled Additional classes of semantic information.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The invention relates generally to digital representations of images and other information and more specifically to techniques for protecting the security of digital representations and of analog forms produced from them.

[0004] 2. Description of the Prior Art

[0005] Nowadays, the easiest way to work with pictures or sounds is often to make digital representations of them. Once the digital representation is made, anyone with a computer can copy the digital representation without degradation, can manipulate it, and can send it virtually instantaneously to anywhere in the world. The Internet, finally, has made it possible for anyone to distribute any digital representation from anywhere in the world. From the point of view of the owners of the digital representations, there is one problem with all of this: pirates, too, have computers, and they can use them to copy, manipulate, and distribute digital representations as easily as the legitimate owners and users can. If the owners of the original digital representations are to be properly compensated for making or publishing them, the digital representations must be protected from pirates.

[0006] There are a number of different approaches that can be used:

[0007] the digital representation may be rendered unreadable except by its intended recipients; this is done with encryption techniques;

[0008] the digital representation may be marked to indicate its authenticity; this is done with digital signatures;

[0009] the digital representation may contain information from which it may be determined whether it has been tampered with in transit; this information is termed a digest and the digital signature often includes a digest;

[0010] the digital representation may contain a watermark, an invisible indication of ownership which cannot be removed from the digital representation and may even be detected in an analog copy made from the digital representation; and

[0011] the above techniques can be employed in systems that not only protect the digital representations, but also meter their use and/or detect illegal use. For an example of a system that uses encryption to protect digital representations, see U.S. Pat. No. 5,646,999, Saito, Data Copyright Management Method, issued Jul. 8, 1997; for a general discussion of digital watermarking, see Jian Zhao, “Look, It's Not There”, in: BYTE Magazine, January, 1997. Detailed discussions of particular techniques for digital watermarking may be found in E. Koch and J. Zhao, “Towards Robust and Hidden Image Copyright Labeling”, in: Proc. Of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, Jun. 20-22, 1995, and in U.S. Pat. No. 5,710,834, Rhoads, Method and Apparatus Responsive to a Code Signal Conveyed through a Graphic Image, issued Jan. 20, 1998.

[0012] For an example of a commercial watermarking system that uses the digital watermarking techniques disclosed in the Rhoads patent, see Digimarc Watermarking Guide, Digimarc Corporation, 1997, available at http://www.digimarc.com in March, 1998.

[0013] FIG. 1 shows a prior-art system 101 which employs the above protection techniques. A number of digital representation clients 105, of which only one, digital representation client 105(j), is shown, are connected via a network 103 such as the Internet to a digital representation server 129 which receives digital representations from clients 105 and distributes them to clients 105. Server 129 includes a data storage device 133 which contains copied digital representations 135 for distribution and a management database 139. Server 129 further includes a program for managing the digital representations 135, a program for reading and writing watermarks 109, a program for authenticating a digital representation and confirming that a digital representation is authentic 111, and a program for encrypting and decrypting digital representations 113. Programs 109, 111, and 113 together make up security programs 107.

[0014] Client 105 has its own versions of security programs 107; it further has editor/viewer program 115 which lets the user of client 105 edit and/or view digital representations that it receives via network 103 or that are stored in storage device 117. Storage device 117 as shown contains an original digital representation 119 which was made by a user of client 105 and a copied digital representation 121 that was received from DR Server 129. Of course, the user may have made original representation 119 by modifying a copied digital representation. Editor/viewer program 115, finally, permits the user to output digital representations to analog output devices 123. Included among these devices are a display 125, upon which an analog image 124 made from a digital representation may be displayed, and a printer 127, upon which an analog image 126 made from the digital representation may be printed. A loudspeaker may also be included in analog output devices 123. The output of the analog output device will be termed herein an analog form of the digital representation. For example, if the output device is a printer, the analog form is printed sheet 126; if it is a display device, it is display 124.

[0015] When client 105(j) wishes to receive a digital representation from server 129, it sends a message requesting the digital representation to server 129. The message includes at least an identification of the desired digital representation and an identification of the user. Manager 131 responds to the request by locating the digital representation in CDRs 135 and consulting management data base 139 to determine the conditions under which the digital representation may be distributed and the status of the user of client 105 as a customer. If the information in data base 139 indicates to manager 131 that the transaction should go forward, manager 131 sends client 105(j) a copy of the selected digital representation. In the course of sending the copy, manager 131 may use watermark reader/writer 109 to add a watermark to the digital representation, use authenticator/confirmer 111 to add authentication information, and use encrypter/decrypter 113 to encrypt the digital representation in such a fashion that it can only be decrypted in DR client 105(j).

[0016] When client 105(j) receives the digital representation, it decrypts it using program 113, confirms that the digital representation is authentic using program 111, and editor/viewer 115 may use program 109 to display the watermark. The user of client 105(j) may save the encrypted or unencrypted digital representation in storage 117. The user of client 105(j) may finally employ editor/viewer 115 to decode the digital representation and output the results of the decoding to an analog output device 123. Analog output device 123 may be a display device 125, a printer 127, or in the case of digital representations of audio, a loudspeaker.

[0017] It should be pointed out that when the digital representation is displayed or printed in analog form, the only remaining protection against copying is watermark 128, which cannot be perceived in the analog form by the human observer, but which can be detected by scanning the analog form and using a computer to find watermark 128. Watermark 128 thus provides a backup to encryption: if a digital representation is pirated, either because someone has broken the encryption, or more likely because someone with legitimate access to the digital representation has made illegitimate copies, the watermark at least makes it possible to determine the owner of the original digital representation and, given that evidence, to pursue the pirate for copyright infringement and/or violation of a confidentiality agreement.

[0018] If the user of client 105(j) wishes to send an original digital representation 119 to DR server 129 for distribution, editor/viewer 115 will send digital representation 119 to server 129. In so doing, editor/viewer 115 may use security programs 107 to watermark the digital representation, authenticate it, and encrypt it so that it can be decrypted only by DR Server 129. Manager 131 in DR server 129 will, when it receives digital representation 119, use security programs 107 to decrypt digital representation 119, confirm its authenticity, enter information about it in management data base 139, and store it in storage 133.

[0019] In the case of the Digimarc system referred to above, manager 131 also includes a World Wide Web spider, that is, a program that systematically follows World Wide Web links such as HTTP and FTP links and fetches the material pointed to by the links.

[0020] Manager program 131 uses watermark reading/writing program 109 to read any watermark, and if the watermark is known to management database 139, manager program 131 takes whatever action may be required, for example, determining whether the site from which the digital representation was obtained has the right to have it, and if not, notifying the owner of the digital representation.

[0021] While encryption, authentication, and watermarking have made it much easier for owners of digital representations to protect their property, problems still remain. One such problem is that the techniques presently used to authenticate digital documents do not work with analog forms; consequently, when the digital representation is output in analog form, the authentication is lost. Another is that present-day systems for managing digital representations are not flexible enough. A third is that watermark checking such as that done by the watermark spider described above is limited to digital representations available on the Internet. It is an object of the present invention to overcome the above problems and thereby to provide improved techniques for distributing digital representations.

SUMMARY OF THE INVENTION

[0022] One aspect of the invention is apparatus for determining authenticity of a digital representation of an object where the digital representation includes embedded first authentication information. The apparatus includes a storage system in which stored second authentication information is associated with stored reference codes and a processor that receives the digital representation and a reference code associated with the digital representation. The processor further includes an authentication information reader. The processor employs the reference code to retrieve the second authentication information and the authentication information reader reads the embedded first authentication information. The processor then uses the read first authentication information and the second authentication information to determine authenticity of the digital representation. The apparatus may also include a key that is associated with the reference code, with the processor using the key to read the first authentication information. The second authentication information may be semantic information in the digital representation which can be read by the authentication information reader as described in the parent. The digital representation may have been made from an analog form and the analog form may have included a security pattern that is a physical part of the analog form. The security pattern may be included with the digital representation and may be used in determining authenticity of the digital representation. Further, there may be many of the apparatuses and they may be connected by a network; in that case, the reference code may be used to route the digital representation to a particular one of the apparatuses.

[0023] Another aspect of the invention is apparatus for checking the authenticity of an analog form that contains embedded first authentication information. The apparatus includes an analog form converter that receives the analog form and makes a digital representation of at least the first authentication information, and a communications system. The analog form converter uses the communications system to send the digital representation and a reference code to a verification system that employs the reference code and the first authentication information to determine whether the analog form is authentic, and to receive from the verification system a notification whether the analog form is authentic. The reference code may either be included in the digital representation or simply sent in association with it. The verification system may employ the reference code to locate a key that is required to read the first authentication information or may employ the reference code to locate second authentication information. The analog form converter may analyze the digital representation before it is sent to determine whether the verification system can check the authenticity of the digital representation. In one application, the analog form is a photo ID and the reference code is an identification number for the photo ID.

[0024] Other objects and advantages of the invention will be apparent to those skilled in the arts to which the invention pertains upon perusing the following Detailed Description and Drawing, wherein:

BRIEF DESCRIPTION OF THE DRAWING

[0025] FIG. 1 is a block diagram of a prior-art system for securely distributing digital representations;

[0026] FIG. 2 is a diagram of a first embodiment of an analog form that can be authenticated;

[0027] FIG. 3 is a diagram of a second embodiment of an analog form that can be authenticated;

[0028] FIG. 4 is a diagram of a system for adding authentication information to an analog form;

[0029] FIG. 5 is a diagram of a system for authenticating an analog form;

[0030] FIG. 6 is a diagram of an analog form that includes a security pattern; and

[0031] FIG. 7 is a diagram of a network system for verifying authenticity of objects.

[0032] The reference numbers in the drawings have at least three digits. The two rightmost digits are reference numbers within a figure; the digits to the left of those digits are the number of the figure in which the item identified by the reference number first appears. For example, an item with reference number 203 first appears in FIG. 2.

DETAILED DESCRIPTION

[0033] The following Detailed Description will first disclose a technique for authenticating digital representations that survives output of an analog form of the digital representation, will then disclose active watermarks, that is, watermarks that contain programs, and will finally disclose watermark agents, that is, programs which examine the digital watermarks on digital representations stored in a system and thereby locate digital representations that are being used improperly.

[0034] Authentication That is Preserved in Analog Forms: FIGS. 2-5

[0035] Digital representations are authenticated to make sure that they have not been altered in transit. Alteration can occur as a result of transmission errors that occur during the course of transmission from the source of the digital representation to its destination, as a result of errors that arise due to damage to the storage device being used to transport the digital representation, as a result of errors that arise in the course of writing the digital representation to the storage device or reading the digital representation from the storage device, or as a result of human intervention. A standard technique for authentication is to make a digest of the digital representation and send the digest to the destination together with the digital representation. At the destination, another digest is made from the digital representation as received and compared with the first. If they are the same, the digital representation has not changed. The digest is simply a value which is much shorter than the digital representation but is related to it such that any change in the digital representation will with very high probability result in a change to the digest.
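The digest comparison described above can be illustrated with a minimal sketch. The choice of SHA-256 and the function names make_digest and is_unaltered are assumptions made for illustration only; the text does not specify a particular digest function.

```python
import hashlib

def make_digest(data: bytes) -> str:
    """Return a short, fixed-length value derived from the whole input.
    SHA-256 is an assumed choice; the text names no specific function."""
    return hashlib.sha256(data).hexdigest()

def is_unaltered(received: bytes, sent_digest: str) -> bool:
    """At the destination, recompute the digest and compare it with the
    digest that accompanied the digital representation."""
    return make_digest(received) == sent_digest

original = b"Pay to the order of J. Smith: 100.00"
digest = make_digest(original)   # sent together with the representation
```

A single changed character in the received bytes yields a different digest, so is_unaltered returns False for the altered copy.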

[0036] Where human intervention is a serious concern, the digest is made using a one-way hash function, that is, a function that produces a digest from which it is extremely difficult or impossible to learn anything about the input that produced it. The digest may additionally be encrypted so that only the recipient of the digital representation can read it. A common technique is to use the encrypted digest as the digital signature for the digital representation, that is, not only to show that the digital representation has not been altered in transit, but also to show that it is from whom it purports to be from. If the sender and the recipient have exchanged public keys, the sender can make the digital signature by encrypting the digest with the sender's private key. The recipient can use the sender's public key to decrypt the digest, and having done that, the recipient compares the digest with the digest made from the received digital representation. If they are not the same, either the digital representation has been altered or the digital representation is not from the person to whom the public key used to decrypt the digest belongs. For details on authentication, see Section 3.2 of Bruce Schneier, Applied Cryptography, John Wiley and Sons, 1994.
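The signing and checking steps just described can be sketched with textbook RSA. Everything specific here is an assumption for illustration: the key is built from two small Mersenne primes and is far too short for real use, no padding is applied, and the names sign and verify are hypothetical. A production system would use a vetted cryptographic library.

```python
import hashlib

# Toy key pair (assumption): two Mersenne primes, much too small for real use.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)

def digest_as_int(data: bytes) -> int:
    """One-way hash of the data, reduced so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Sender: 'encrypt' the digest with the private key."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Recipient: 'decrypt' the signature with the public key and compare
    the result with a digest made from the data as received."""
    return pow(signature, E, N) == digest_as_int(data)
```

If verify returns False, either the data was altered or the signature was not made with the private key that corresponds to the public key used for the check.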

[0037] The only problem with authentication is that it is based entirely on the digital representation. The information used to make the digest is lost when the digital representation is output in analog form. For example, if the digital representation is a document, there is no way of determining from a paper copy made from the digital representation whether the digital representation from which the paper copy was made is authentic or whether the paper copy is itself a true copy of the digital representation.

[0038] While digital watermarks survive and remain detectable when a digital representation is output in analog form, the authentication problem cannot be solved simply by embedding the digest or digital signature in the watermark. There are two reasons for this:

[0039] Watermarking changes the digital representation; consequently, if a digital representation is watermarked after the original digest is made, the watermarking invalidates the original digest, i.e., it is no longer comparable with the new digest that the recipient makes from the watermarked document.

[0040] More troublesome still, when a digital representation is output in analog form, so much information about the digital representation is lost that the digital representation cannot be reconstructed from the analog form. Thus, even if the original digest is still valid, there is no way of producing a comparable new digest from the analog form.

[0041] What is needed to overcome these problems is an authentication technique which uses information for authentication which is independent of the particular form of the digital representation and which will be included in the analog form when the analog form is output. As will be explained in more detail in the following, the first requirement is met by selecting semantic information from the digital representation and using only the semantic information to make the digest. The second requirement is met by incorporating the digest into the digital representation in a fashion such that it on the one hand does not affect the semantic information used to make the digest and on the other hand survives in the analog form. In the case of documents, an authentication technique which meets these requirements can be used not only to authenticate analog forms of documents that exist primarily in digital form, but also to authenticate documents that exist primarily or only in analog form, for example paper checks and identification cards.

[0042] Semantic Information

[0043] The semantic information in a digital representation is that portion of the information in the digital representation that must be present in the analog form made from the digital representation if the human who perceives the analog form is to consider it a copy of the original from which the digital representation was made. For example, the semantic information in a digital representation of an image of a document is the representations of the alphanumeric characters in the document, where alphanumeric is understood to include representations of any kind of written characters or punctuation marks, including those belonging to non-Latin alphabets, to syllabic writing systems, and to ideographic writing systems. Given the alphanumeric characters, the human recipient of the analog form can determine whether a document is a copy of the original, even though the characters may have different fonts and may have been formatted differently in the original document. There is analogous semantic information in digital representations of pictures and of audio information. In the case of pictures, it is the information that is required for the human that perceives the analog form to agree that the analog form is a copy (albeit a bad one) of the original picture, and the same is the case with audio information.

[0044] In the case of a document written in English, the semantic information in the document is the letters and punctuation of the document. If the document is in digital form, it may be represented either as a digital image or in a text representation language such as those used for word processing or printing. In the first case, optical character recognition (OCR) technology may be applied to the image to obtain the letters and punctuation; in the second case, the digital representation may be parsed for the codes that are used to represent the letters and punctuation in the text representation language. If the document is in analog form, it may be scanned to produce a digital image and the OCR technology applied to the digital image produced by scanning.

[0045] Using Semantic Information to Authenticate an Analog Form: FIGS. 2 and 3

[0046] Because the semantic information must be present in the analog form, it may be read from the analog form and used to compute a new digest. If the old digest was similarly made from the semantic information in the digital representation and the old digest is readable from the analog form, the new digest and the old digest can be compared as described in the discussion of authentication above to determine the authenticity of the analog form.

[0047] FIG. 2 shows one technique 201 for incorporating the old digest into an analog form 203. Analog form 203 of course includes semantic information 205; here, analog form 203 is a printed or faxed document and semantic information 205 is part or all of the alphanumeric characters on analog form 203. Sometime before analog form 203 was produced, semantic information 205 in the digital representation from which analog form 203 was produced was used to make semantic digest 207, which was incorporated into analog form 203 at a location which did not contain semantic information 205 when analog form 203 was printed. In some embodiments, semantic digest 207 may be added to the original digital representation; in others, it may be added just prior to production of the analog form. Any representation of semantic digest 207 which is detectable from analog form 203 may be employed; in technique 201, semantic digest 207 is a visible bar code. Of course, semantic digest 207 may include additional information; for example, it may be encrypted as described above and semantic digest 207 may include an identifier for the user whose public key is required to decrypt semantic digest 207. In such a case, semantic digest 207 is a digital signature that persists in the analog form.

[0048] With watermarking, the semantic digest can be invisibly added to the analog form. This is shown in FIG. 3. In technique 301, analog form 303 again includes semantic information 305. Prior to producing analog form 303, the semantic information in the digital representation from which analog form 303 is produced is used as described above to produce semantic digest 207; this time, however, semantic digest 207 is incorporated into watermark 307, which is added to the digital representation before the analog form is produced from the digital representation and which, like the bar code of FIG. 2, survives production of the analog form. A watermark reader can read watermark 307 from a digital image made by scanning analog form 303, and can thereby recover semantic digest 207 from watermark 307. As was the case with the visible semantic digest, the semantic digest in watermark 307 may be encrypted and may also function as a digital signature.

[0049] Adding a Semantic Digest to an Analog Form: FIG. 4

[0050] FIG. 4 shows a system 401 for adding a semantic digest to an analog form 203. The process begins with digital representation 403, whose contents include semantic information 205. Digital representation 403 is received by semantics reader 405, which reads semantic information 205 from digital representation 403. Semantics reader 405's operation will depend on the form of the semantic information. For example, if digital representation 403 represents a document, the form of the semantic information will depend on how the document is represented. If it is represented as a bit-map image, the semantic information will be images of alphanumeric characters in the bit map; if it is represented using one of the many representations of documents that express alphanumeric characters as codes, the semantic information will be the codes for the alphanumeric characters. In the first case, semantics reader 405 will be an optical character reading (OCR) device; in the second, it will simply parse the document representation looking for character codes.

[0051] In any case, at the end of the process, semantics reader 405 will have extracted some form of semantic information, for example the ASCII codes corresponding to the alphanumeric characters, from representation 403. This digital information is then provided to digest maker 409, which uses it to make semantic digest 411 in any of many known ways. Depending on the kind of document the semantic digest is made from and its intended use, the semantic digest may have a form which requires an exact match with the new digest or may have a form which permits a “fuzzy” match. Digital representation 403 and semantic digest 411 are then provided to digest incorporator 413, which incorporates a representation 207 of digest 411 into the digital representation used to produce analog form 203. As indicated above, the representation must be incorporated in such a way that it does not affect semantic information 205. Incorporator 413 then outputs the representation it produces to analog form producer 415, which produces analog form 203 in the usual fashion. Analog form 203 of course includes semantic information 205 and representation 207 of semantic digest 411. Here, the bar code is used, but representation 207 could equally be part of a watermark, as in analog form 303. Components 405, 409, and 413 may be implemented as programs executed on a digital computer system; analog form producer 415 may be any device which can output an analog form.
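The flow through components 405, 409, and 413 can be sketched for the case of a document represented with character codes. The sketch makes several assumptions not found in the text: the document is a string with angle-bracket formatting tags, whitespace is treated as layout rather than semantic information, the digest is a truncated SHA-256 value, and a hypothetical barcode tag stands in for the visible bar code.

```python
import hashlib
import re

def read_semantics(markup: str) -> str:
    """Stand-in for semantics reader 405: keep only the character codes,
    dropping formatting tags and (by assumption) layout whitespace."""
    text = re.sub(r"<[^>]+>", "", markup)    # strip formatting tags
    return "".join(text.split())             # drop spaces and line breaks

def make_semantic_digest(semantics: str) -> str:
    """Stand-in for digest maker 409: an exact-match digest, truncated
    here only to keep the example readable."""
    return hashlib.sha256(semantics.encode("utf-8")).hexdigest()[:16]

def incorporate(markup: str, digest: str) -> str:
    """Stand-in for digest incorporator 413: place the digest in a region
    that carries no semantic information (hypothetical barcode tag)."""
    return markup + f'<barcode value="{digest}"></barcode>'

document = "<b>Pay</b> J. Smith <i>100.00</i>"
semantics = read_semantics(document)
marked = incorporate(document, make_semantic_digest(semantics))
```

Because the digest is carried in a tag attribute, reading the semantic information from the marked document gives the same result as reading it from the original, which is the property the text requires of the incorporator.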

[0052] Authenticating an Analog Form that has a Semantic Digest

[0053] FIG. 5 shows a system 501 for authenticating an analog form 503 that has a semantic digest 207. Analog form 503 is first provided to semantic digest reader 505 and to semantics reader 507. Semantic digest reader 505 reads semantic digest 207; if semantic digest 207 is a bar code, semantic digest reader 505 is a bar code reader; if semantic digest 207 is included in a digital watermark, semantic digest reader 505 is a digital watermark reader which receives its input from a scanner. If semantic digest 207 must be decrypted, semantic digest reader 505 will do that as well. In some cases, that may require sending the encrypted semantic digest to a remote location that has the proper key.

[0054] Semantics reader 507 reads semantic information 305. If analog form 503 is a document, semantics reader 507 is a scanner which provides its output to OCR software. With other images, the scanner provides its output to whatever image analysis software is required to analyze the features of the image that make up semantic information 305. If analog form 503 is audio, the audio will be input to audio analysis software. Once the semantics information has been reduced to semantics data 509, it is provided to semantic digest maker 511, which makes a new semantic digest 513 out of the information. To do so, it uses the same technique that was used to make old semantic digest 515. Comparator 517 then compares old semantic digest 515 with new semantic digest 513; if the digests match, comparison result 519 indicates that analog form 503 is authentic; if they do not, result 519 indicates that it is not authentic. What “match” means in this context will be explained in more detail below.
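The comparison performed by system 501 can be sketched as follows, with the outputs of the two readers modeled as plain strings. The ScannedForm type, the whitespace normalization, and the truncated SHA-256 digest are assumptions for illustration; the only requirement taken from the text is that the new digest be made with the same technique as the old one.

```python
import hashlib
from typing import NamedTuple

class ScannedForm(NamedTuple):
    ocr_text: str     # semantics data, as produced by semantics reader 507
    old_digest: str   # value recovered by semantic digest reader 505

def make_semantic_digest(semantics_data: str) -> str:
    """Must be the same technique that produced the old digest (assumed
    here: drop whitespace, then take a truncated SHA-256)."""
    normalized = "".join(semantics_data.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

def authenticate(form: ScannedForm) -> bool:
    """Stand-in for comparator 517: an exact match means authentic."""
    new_digest = make_semantic_digest(form.ocr_text)
    return new_digest == form.old_digest
```

With an exact-match digest of this kind, a single misread character makes authenticate return False, which motivates the discussion of matching in the section that follows.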

[0055] “Matching” Semantic Digests

[0056] With the digests that are normally used to authenticate digital representations, exact matches between the old and new digests are required. One reason for this is that in most digital contexts, “approximately correct” data is useless; another is that the one-way hashes normally used for digests are “cryptographic”, that is, the value of the digest reveals nothing about the value from which it was made by the hash function, or in more practical terms, a change of a single bit in the digital representation may result in a large change in the value produced by the hash function. Since that is the case, the only comparison that can be made between digests is one of equality.

[0057] In the context of authenticating analog forms, the requirement that digests be equal causes difficulties. The reason for this is that reading semantic information from an analog form is an error-prone operation. For example, after many years of effort, OCR technology has gotten to the point where it can in general recognize characters with 98% accuracy when it begins with a clean copy of a document that is simply formatted and uses a reasonable type font. Such an error rate is perfectly adequate for many purposes; but for semantic information of any size, a new digest will almost never be equal to the old digest when the new digest is made from semantics data that is 98% the same as the semantics data that was used to make the old semantic digest. On the other hand, if the semantics data obtained from the analog form is 98% the same as the semantics data obtained from the digital representation, there is a very high probability that the analog form is in fact an authentic copy of the digital representation.
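The claim that equal digests become very unlikely as the semantic information grows can be checked with a short calculation. The sketch assumes that recognition errors on different characters are independent, which the text does not state; under that assumption the chance that n characters are all read correctly is 0.98 raised to the power n.

```python
def p_all_correct(n_chars: int, per_char_accuracy: float = 0.98) -> float:
    """Probability that every one of n_chars characters is read correctly,
    assuming independent errors (a simplifying assumption)."""
    return per_char_accuracy ** n_chars

for n in (10, 100, 1000):
    print(n, p_all_correct(n))
```

Under this assumption a 10-character field is read without error roughly 82% of the time, a 100-character passage roughly 13% of the time, and a 1000-character page almost never, so an exact-match digest is practical only for short, constrained fields.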

[0058] Precise Matches

[0059] Of course, if the semantic information is limited in size and tightly constrained, it may be possible to require that the digests be exactly equal. For example, many errors can be eliminated if what is being read is specific fields, for example in a check or identification card, and the OCR equipment is programmed to take the nature of the field's contents into account. For example, if a field contains only numeric characters, the OCR equipment can be programmed to treat the letters o and O as the number 0 and the letters l, i, or I as the number 1. Moreover, if a match fails and the semantic information contains a character that is easily confused by the OCR equipment, the character may be replaced by one of the characters with which it is confused, the digest may be recomputed, and the match may again be attempted with the recomputed digest.
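The exact-match procedure just described might be sketched as follows. This is a minimal illustration and not the disclosed implementation: the tables NUMERIC_FIXES and CONFUSABLE, the retry limit max_tries, and the use of SHA-256 as the digest are all assumptions made for the example.

```python
import hashlib
from itertools import product

# Assumed mapping of letters that OCR may emit in place of digits.
NUMERIC_FIXES = {"o": "0", "O": "0", "l": "1", "i": "1", "I": "1"}

# Hypothetical table of digits the OCR equipment confuses with one another.
# Each value lists the character as read first, then its alternatives.
CONFUSABLE = {"1": "17", "7": "71", "3": "38", "8": "83"}


def digest(text: str) -> str:
    """Cryptographic digest; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize_numeric(field: str) -> str:
    """Treat look-alike letters in a numeric-only field as digits."""
    return "".join(NUMERIC_FIXES.get(ch, ch) for ch in field)


def precise_match(ocr_field: str, old_digest: str, max_tries: int = 256) -> bool:
    """Retry the exact match with confusable characters substituted."""
    field = normalize_numeric(ocr_field)
    options = [CONFUSABLE.get(ch, ch) for ch in field]
    # The first candidate produced is the field exactly as read.
    for tries, candidate in enumerate(product(*options)):
        if tries >= max_tries:
            break
        if digest("".join(candidate)) == old_digest:
            return True
    return False
```

The retry limit keeps the number of recomputed digests bounded, which matters because the candidate count grows quickly with the number of confusable characters in the field.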

[0060] Fuzzy Matches

[0061] Where the semantic information is not tightly constrained, the digests must be made in such a fashion that closely-similar semantic information produces closely-similar digests.

[0062] When that is the case, matching becomes a matter of determining whether the difference between the digests is within a threshold value, not of determining whether they are equal. A paper by Marc Schneider and Shih-Fu Chang, “A Robust Content Based Digital Signature for Image Authentication”, in: Proceedings of the 1996 International Conference on Image Processing, presents some techniques for dealing with related difficulties in the area of digital imaging. There, the problems are not caused by loss of information when a digital representation is used to make an analog form and by mistakes made in reading analog forms, but rather by “lossy” compression of images, that is, compression using techniques which result in the loss of information. Because the lost information is missing from the compressed digital representation, a digest made using cryptographic techniques from the compressed digital representation will not be equal to one made from the digital representation prior to compression, even though the compressed and uncompressed representations contain the same semantic information. Speaking generally, the techniques presented in the Schneider paper deal with this problem by calculating the digest value from characteristics of the image that are not affected by compression, such as the spatial location of its features. Where there are sequences of images, the digest value is calculated using the order of the images in the sequences.

[0063] Analogous approaches may be used to compute the semantic digest used to authenticate an analog form. For example, a semantic digest for a document can be computed like this:

[0064] 1. Set the current length of a digest string that will hold the semantic digest to “0”;

[0065] 2. Starting with the first alphanumeric character in the document, perform the following steps until there are no more characters in the document:

[0066] a. Select a next group of characters;

[0067] b. for the selected group,

[0068] i. replace characters in the group such as O, 0, o; I, i, l, 1; or c, e that cause large numbers of OCR errors with a “don't care” character;

[0069] ii. make a hash value from the characters in the group;

[0070] iii. append the hash value to the semantic digest string;

[0071] c. return to step (a).

[0072] 3. When there are no more characters in the document, make the semantic digest from the digest string.

[0073] When computed in this fashion, the sequence of values in the semantic digest string reflects the order of the characters in each of the sequences used to compute the digest. If the sequence of values in the new semantic digest that is computed from the analog form has a high percentage of matches with the sequence of values in the old semantic digest, there is a high probability that the documents contain the same semantic information.
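The digest computation of steps 1 through 3 and the percentage-of-matches comparison might be sketched as follows. The group size, the don't-care symbol, the truncated SHA-256 group hash, and the function names are assumptions of this sketch; the text above does not fix any of them.

```python
import hashlib
from typing import List

DONT_CARE = "?"                    # assumed "don't care" symbol
CONFUSING = set("O0oIil1ce")       # characters named in step 2(b)(i)
GROUP_SIZE = 8                     # assumed number of characters per group


def semantic_digest(text: str, group_size: int = GROUP_SIZE) -> List[str]:
    """Return one short hash value per group of alphanumeric characters."""
    chars = [c for c in text if c.isalnum()]
    values = []
    for start in range(0, len(chars), group_size):
        group = chars[start:start + group_size]
        # Step 2(b)(i): mask characters that cause many OCR errors.
        masked = "".join(DONT_CARE if c in CONFUSING else c for c in group)
        # Step 2(b)(ii)-(iii): hash the group and append the value.
        values.append(hashlib.sha256(masked.encode("utf-8")).hexdigest()[:8])
    return values


def match_ratio(old: List[str], new: List[str]) -> float:
    """Fraction of positions at which the two value sequences agree."""
    if not old and not new:
        return 1.0
    same = sum(1 for a, b in zip(old, new) if a == b)
    return same / max(len(old), len(new))
```

A verifier would accept the analog form when match_ratio exceeds a chosen threshold. Joining the returned values gives the digest string of step 3. Note that this simple grouping assumes the OCR neither drops nor inserts characters; a dropped character shifts every later group.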

[0074] Applications of Authentication with Analog Forms

[0075] One area of application is authenticating written documents generally. To the extent that the document is of any length and the digest is computed from a significant amount of the contents, the digest will have to be computed in a fashion which allows fuzzy matching. If the digest is computed from closely-constrained fields of the document, exact matching may be employed.

[0076] Another area of application is authenticating financial documents such as electronic cash, electronic checks, and bank cards. Here, the fields from which the digest is computed are tightly constrained and an exact match may be required for security. In all of these applications, the digest or even the semantic information itself would be encrypted as described above to produce a digital signature.

[0077] Universal Paper & Digital Cash

[0078] Digital cash is at present a purely electronic medium of payment. A given item of digital cash consists of a unique serial number and a digital signature. Authentication using semantic information permits digital cash to be printed as digital paper cash. The paper cash is printed from an electronic image which has a background image, a serial number, and a money amount. The serial number and the money amount are the semantic information. The serial number and the money amount are used to make a digital signature and the digital signature is embedded as an electronic watermark into the background image. The paper cash can be printed by any machine which needs to dispense money. Thus, an ATM can dispense digital paper cash instead of paper money. Similarly, a vending machine can make change with digital paper cash and a merchant can do the same. The digital paper cash can be used in the same way as paper money. When a merchant (or a vending machine) receives the digital paper cash in payment, he or she uses a special scanner (including OCR technology and a watermark reader) to detect the watermark (i.e. the serial number and money amount) from the printed image, and sends them to the bank for verification in the same fashion as is presently done with credit cards.

[0079] Digital Checks

[0080] Digital checks can be made using the same techniques as are used for digital paper cash.

[0081] The digital check includes a background image, an identifier for the bank account, an amount to be paid, and the name of the payer. The payer's private key is used to make a digital signature from at least the identification of the bank and the amount to be paid, and the digital signature is embedded as an electronic watermark in the background image. Writing a digital check is a three-step process: enter the amount, produce the digital signature from the bank account number and the amount using the payer's private key, and embed the digital signature into the background image. The bank verifies the check by detecting the watermark from the digital check, decrypting the digital signature with the payer's public key, and comparing the bank account number and the amount from the image with the bank account number and the amount on the face of the check. A digital check can be used in either electronic form or paper form. In the latter case, a scanner (including OCR technology and watermark reader) is needed to read the watermark from the paper check.

[0082] Authentication of Identification Cards

[0083] The techniques described above for authenticating digital paper cash or digital checks can be used with identification cards, including bankcards. The card number or other identification information appears on the face of the card, is encrypted into a digital signature, and is embedded as a digital watermark in the background image of the bankcard. The encryption can be done with the private key of the institution that issues the card. The merchant uses a scanner to detect the digital signature (i.e. card number or other ID) from the card, and compares the signature with the authentication stored inside the card. This technique can of course be combined with conventional authentication techniques such as the holographic logo.

[0084] Additional Classes of Semantic Information

[0085] As defined in the parent of the present patent application, semantic information is information that must be present in any analog form made from the digital representation of an object. Further consideration of the necessary properties of semantic data has led to the realization that there are many kinds of semantic information and that the semantic information may exist at a number of different levels in a digital representation or an analog form.

[0086] at the signal level: the semantic information may be high-order bits of image pixels or audio samples or the most significant frequency components computed by a visual perception model for images and video or by an auditory perception model for audio.

[0087] at the vector level: the semantic information may be features that are represented by vector data. Examples for images are edges, shapes, areas, and objects; for video, time relationships between frames may be used as well. With audio, the instrumental or vocal sounds are such features.

[0088] at the level of content codes: the semantic information may be codes that represent the content of the object. One example of content codes is the codes that represent the alphanumeric characters in documents, for instance the widely-used ASCII codes for alphanumeric characters. These and other codes representing alphanumeric characters are used in the files produced by various word processors and document distribution systems. Another example of content codes is the MIDI codes used to define the notes to be played in MIDI files.

[0089] at the appearance or presentation level: the semantic information may be a description of the appearance or presentation of the content. Examples are fonts, colors, sizes, and other appearance features of word processor files, style tags and style sheets in HTML, XML or SGML files and analogous features of MIDI files.

[0090] at the metadata level: metadata is information which is not itself part of the digital representation, but is a description of the contents of the digital representation. The metadata may either appear in the analog form or be inferable from the analog form. Examples are labels and captions in images and video, scripts in video and audio, mathematical descriptions of relationships between objects in images or video, and the words for a piece of music.

[0091] The authentication techniques of the parent patent application can be used with semantic information belonging to any of the above classes. To make the authentication information from a given kind of the semantic information, one merely requires a device that can read the information. Examples are a function to compute the most significant bits of image pixels or audio samples, a device that recognizes objects in images or video or audio features in audio, a device which reads the metadata, or a voice-to-text conversion device which converts voice to text (which is then used to compute the authentication information).

[0092] The semantic information can be used to authenticate digital representations made from analog forms of objects, as described in the parent of the present patent application, and can also be used to authenticate any digital representation, whether or not made from an analog form. To make a digital representation that can be authenticated from an analog form of an object, one employs devices that can sense the semantic information in the analog form as described in the parent of the present application. Examples are a scanner, digital cameras and video cameras, a microphone and a recorder, or an analog to digital converter for signal information. Such devices are of course not necessary if the object being authenticated was originally in digital form. An example of authentication of objects that are never in analog form is authenticating video frames produced by a digital video surveillance system.

[0093] A General Approach to Embedding Authentication Information: FIG. 2

[0094] The parent patent application described how authentication information could be included anywhere in a document as long as its presence did not affect the semantic representation. The technique used in documents is a specific example of the following general technique, namely computing the authentication information based on a part (P1) of the document or other object and embedding it in a part (P2) which does not overlap with P1. Since there is no overlapping, the modification of P2 that is a consequence of embedding the authentication information in it does not affect P1.

[0095] The general technique can be used with semantic information as described above or with any other information in the object which must remain unaffected when the authentication information is embedded in the object. As can be seen from this fact, the technique is useful not only for authentication of analog forms, but also for authentication of digital representations. Where no analog forms are involved, all that is required is that P1 does not overlap P2 in the digital representation. Where analog forms are involved, P1 must also not overlap P2 in the analog form made from the digital representation. In the document context of FIGS. 2 and 3, in FIG. 2, P1 209 is the characters of the document and P2 211 is the margin in which the barcode is placed; in FIG. 3, P1 is the characters and P2 is a portion of the watermark which is separate from the characters. Other examples of the technique follow:

[0096] P1 is the M most significant bits of each image pixel's RGB values or audio samples, P2 is bits in the remaining least significant bits.

[0097] P1 is the M most significant frequency coefficients in a DCT block (an image block transformed by Discrete Cosine Transformation), P2 is frequency coefficients in the remaining least significant frequency coefficients in the DCT block.

[0098] P1 is a specific region of an image in the spatial domain which contains all semantic information, while P2 is the remaining regions of the image.

[0099] P1 is text of a document, which contains all semantic information, while P2 is the image of the document, represented in pixels.

[0100] P1 is a text layer of a document, which contains all semantic information, while P2 is the background image layer of the document.

[0101] P1 is text of a document, which contains all semantic information, while P2 is a graphic (such as a seal, logo, stamp) in the document, which does not overlap with the text.

[0102] P1 is a class of semantic information (signal-level, vector-level, text-level, appearance-level, or metadata-level, defined as above), while P2 is the document data at another level. For example, P1 is text-level semantic information and P2 is the metadata, appearance-level description, or signal-level data of the document.

[0103] As used in the above descriptions, layer means a part of the digital representation or analog form that can be separated from other parts of the digital representation or the analog form. Examples of layers are:

[0104] 1) the alphanumeric characters of a document and the image of the formatted document containing those alphanumeric characters.

[0105] 2) the alphanumeric characters of a document and graphics components of the document that don't overlap with the characters.

[0106] 3) the alphanumeric characters and a background image that visually overlays the document containing the characters.
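The first of the P1/P2 examples listed above (P1 is the M most significant bits of each sample, P2 is the remaining least significant bits) might be sketched as follows for 8-bit samples. The value M = 4, the use of SHA-256, the choice of one embedded bit per sample, and the function names are assumptions of this sketch.

```python
import hashlib
from typing import List

M = 4  # assumed number of most significant bits per 8-bit sample in P1


def p1_bytes(samples: List[int]) -> bytes:
    """P1: the M most significant bits of every sample."""
    return bytes(s >> (8 - M) for s in samples)


def auth_bits(samples: List[int], n_bits: int) -> List[int]:
    """Authentication bits computed only from P1 (SHA-256 is an assumption)."""
    d = hashlib.sha256(p1_bytes(samples)).digest()
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(n_bits)]


def embed(samples: List[int]) -> List[int]:
    """Write the authentication bits into P2: one least significant bit each."""
    n_bits = min(len(samples), 256)
    out = list(samples)
    for i, bit in enumerate(auth_bits(samples, n_bits)):
        out[i] = (out[i] & 0xFE) | bit  # only the lowest bit changes
    return out


def verify(samples: List[int]) -> bool:
    """Recompute the bits from P1 and compare them with what P2 holds."""
    n_bits = min(len(samples), 256)
    expected = auth_bits(samples, n_bits)
    return all((samples[i] & 1) == expected[i] for i in range(n_bits))
```

Because embedding touches only the lowest bit, P1 is identical before and after embedding, which is the non-overlap property the general technique relies on.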

[0107] Improving the Capture of Semantic Information with OCR

[0108] The parent discloses a method of using OCR to capture the semantic information from the analog form of the authenticated document. The problem with OCR techniques is that achieving a recognition correctness rate of 100% is hard, yet this is often required by the authentication verification techniques. One solution to this problem, “fuzzy” matching, was described in the parent. Another is including error correction code with the embedded authentication information that will permit correction of errors caused by confusing alphanumeric characters. One simple approach is to keep track of the positions of common confusing characters such as “l” or “1”, “m” or “n”, “O” or “0” in semantic information when the embedded authentication information is produced in the original electronic form of the object and then ignore the characters at those positions when the embedded authentication information is produced. The positions of the ignored characters can then be included as an error code in the electronic version of the document. During authenticity verification, the OCR'd characters at the positions specified in the error correction code can be similarly ignored when computing the authentication information in the verification process.
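The position-based approach just described might be sketched as follows. The set of confusing characters, the use of SHA-256, and the function names are assumptions of this sketch, which also assumes that the OCR output has the same number of characters as the original.

```python
import hashlib
from typing import List, Tuple

CONFUSING = set("l1mnO0")  # assumed set of commonly confused characters


def make_auth_info(text: str) -> Tuple[str, List[int]]:
    """Return (authentication information, error code of ignored positions)."""
    positions = [i for i, c in enumerate(text) if c in CONFUSING]
    kept = "".join(c for c in text if c not in CONFUSING)
    return hashlib.sha256(kept.encode("utf-8")).hexdigest(), positions


def verify(ocr_text: str, auth_info: str, positions: List[int]) -> bool:
    """Ignore the OCR'd characters at the recorded positions, then compare."""
    skip = set(positions)
    kept = "".join(c for i, c in enumerate(ocr_text) if i not in skip)
    return hashlib.sha256(kept.encode("utf-8")).hexdigest() == auth_info
```

The trade-off is that characters at the ignored positions are not protected at all, so a change confined to those positions goes undetected.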

[0109] Another approach is including the common confusing characters with the embedded authentication information. The following steps are performed when the semantic information and embedded authentication information are produced in the original electronic form of the object:

[0110] 1. Sequentially search for the confusing characters in semantic information of the original electronic form of the object and put these characters into a character stream S1.

[0111] 2. Apply encoding techniques (e.g. if there are a total of 7 pairs of confusing characters, 3 bits are needed to encode all pairs) and lossless compression such as Huffman encoding to reduce the size of the character stream. The encoded and compressed stream is S2.

[0112] 3. Embed S2 as a watermark or barcode into the document in the same way as the authentication information is embedded. The watermark can be embedded into a background image or into a graphic (logo, seal, stamp, etc.) that doesn't overlap with text.

[0113] S2 may be further encoded using error correcting codes such as Reed-Solomon codes, BCH codes, the binary Golay code, CRC-32 or Hamming code. For a larger document, S2 may be split into multiple pieces and each piece may be embedded into a unit (e.g. a page) of the document. As an alternative, the confusing characters in the semantic information may be collected from each unit (e.g. page) of the original electronic form of the object and put into a character stream S2 for the unit. S2 is then encoded and compressed, and embedded in this unit of the document. The advantage here is that each unit of the document can be self-authenticated.

[0114] The following steps are performed during the verification process:

[0115] Sequentially search for the confusing characters in semantic information that is recognized by an OCR system and put these characters into a character stream S1

[0116] Read S2

[0117] Use S2 to correct possible errors in S1

[0118] The above steps are also applied where the error code is made on a per-unit basis.
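The production and verification steps above might be sketched as follows. The confusion groups are hypothetical, zlib (whose DEFLATE format includes Huffman coding) stands in for the encoding and compression of step 2, and the sketch assumes that the OCR only ever confuses a character with another member of the confusing set, so that the two S1 streams have equal length.

```python
import zlib

# Hypothetical groups of characters that the OCR confuses with one another.
GROUPS = ["l1I", "O0o", "mn", "ce"]
CONFUSING = set("".join(GROUPS))


def confusing_stream(text: str) -> str:
    """Step 1: collect the confusing characters, in order, into S1."""
    return "".join(c for c in text if c in CONFUSING)


def make_error_code(original: str) -> bytes:
    """Step 2: compress S1 into S2 (zlib is a stand-in chosen for this sketch)."""
    return zlib.compress(confusing_stream(original).encode("ascii"))


def correct(ocr_text: str, s2: bytes) -> str:
    """Verification: use S2 to correct the confusing characters read by OCR."""
    truth = zlib.decompress(s2).decode("ascii")
    if len(confusing_stream(ocr_text)) != len(truth):
        raise ValueError("OCR output and error code do not line up")
    it = iter(truth)
    return "".join(next(it) if c in CONFUSING else c for c in ocr_text)
```

The corrected text can then be passed to whatever routine computes the authentication information, so that an exact digest comparison becomes feasible.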

[0119] Particularly Useful Kinds of Semantic Information

[0120] What kinds of semantic information should be used for authentication depends on the application. In general, the semantic information should be information which is absolutely necessary for the digital representation or the analog form to perform its proper function. Some applications and the preferred semantic information for the application are:

[0121] Banknotes or other currency: for each note: serial number, printing place, amount, treasurer

[0122] Personal identification documents: for each document: name, birth date, issuer, expiration

[0123] Official documents generally (immigration papers, tax forms, licenses, certificates of title, diplomas, and the like): for each document: name of person to whom the document pertains and document number

[0124] Documents that give a person an entitlement to something (checks, credit and debit cards, shares of stock, tickets of all kinds, coupons and credit vouchers): for each document: identification of the issuing entity, details of the entitlement, and document number

[0125] Details for private documents that give entitlements depend on the kind of private document. For a check, the semantic information includes the name and number of the bank or other fiduciary information, the amount, the check number, and the date of the check. For a theater ticket, it includes the theater's name, the name of the performance, the date and time of the performance, and a serial number for the ticket.

[0126] Types of Media with which the Authentication Techniques can be Used

[0127] The authentication techniques described above can be used with any object that has a first portion that contains semantic information from which the authentication information can be computed and a second portion that is separate from the first portion in which the authentication information can be embedded. The first and second portions may be parts of the object itself, may be parts of a label or document that accompanies the object, or the object may be the first portion and the label the second. The first portion may be of any material from which the semantic information may be read and the second portion may be any material which permits embedding and reading of the embedded authentication information.

[0128] One example here would be the authentication of an autographed baseball; another is the authentication of a plastic ID. In the case of the autographed baseball, the second portion must be a separate object from the baseball, since the authentication information cannot be printed on the baseball without reducing the baseball's value. The first portion is the signature on the baseball (treated as an image); the authentication information is a digital signature made from the image of the signature and encrypted with the private key of an authentication authority, and the second portion is a certificate of authenticity that accompanies the baseball and has the authentication information embedded in it.

[0129] In the case of the plastic ID card, the first portion is the part of the card that contains the identified person's name, birth date, and other identity information and the second portion can be a photo of the identified person. The authentication information is incorporated into the photo as a digital watermark.

[0130] Combining the Authentication Techniques with Security Features in the Analog Form: FIG. 6

[0131] A significant barrier to the use of authentication information in analog documents where the authentication information is based on semantic information in the analog document and embedded in a watermark is that high-quality photocopying copies the watermark along with the semantic information and does so in sufficient detail that the authentication information can still be read from the watermark.

[0132] This problem can be dealt with by adding information to the analog form that cannot be copied by even a high-quality photocopier. Ideally, the information will be in the form of machine-readable security patterns. The larger the pattern or the more variations it has, the more secure the system. An example of such a pattern is a security code that is printed on the analog form in invisible magnetic ink. The pattern is then used to compute the authentication information embedded in the watermark. Since the pattern is part of the authentication information embedded in the watermark, verification succeeds only if both the watermark and the security code have been copied. FIG. 6 shows how this is done in document 601. Document 601 includes semantic information 603 and watermark 607 in which authentication information that is computed using semantic information 603 is embedded. Additionally, document 601 includes security pattern 605 which is machine readable but cannot be copied by a copier. The authentication information embedded in watermark 607 is produced using both semantic information 603 and security pattern 605 and the device that reads the security pattern can provide the security pattern to the device that authenticates the document. As can be seen from the use of security pattern 605 to produce the authentication information, security pattern 605 can be seen as a kind of semantic information that is an attribute only of the analog form.
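The way security pattern 605 enters the authentication information might be sketched as follows. A keyed hash (HMAC-SHA-256) is used here purely as a stand-in for the digital signature, and the function names, the length-prefixed message layout, and the convention that a missing pattern is reported as None are assumptions of this sketch.

```python
import hashlib
import hmac
from typing import Optional


def make_auth_info(key: bytes, semantic: bytes, pattern: bytes) -> bytes:
    """Authentication information from semantic information 603 and pattern 605."""
    # Length prefix keeps the boundary between the two inputs unambiguous.
    message = len(semantic).to_bytes(4, "big") + semantic + pattern
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, semantic: bytes, pattern_read: Optional[bytes],
           embedded: bytes) -> bool:
    """Succeeds only if the pattern reader supplies the original pattern."""
    if pattern_read is None:  # e.g. a photocopy carrying no magnetic ink
        return False
    expected = make_auth_info(key, semantic, pattern_read)
    return hmac.compare_digest(expected, embedded)
```

A photocopy reproduces the watermark and the semantic information but not the pattern, so the recomputed value cannot equal the embedded one.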

[0133] There are a number of techniques available for including security pattern 605 in the analog form:

[0134] Fluorescent inks or fluorescent fibers in the analog form: the inks or fibers are revealed under ultraviolet light. Combinations of density, colors, shapes and other features of the inks or fibers can be used to make a large number which is visible under ultraviolet light. This number can be detected by a device (e.g. a digital camera with a UV lens) automatically, and can also be part of the information used in the authentication process. These inks or fibers are not transferable by photocopying. Therefore, to forge a document, a counterfeiter must be able to access the inks/fibers as well as being able to copy the digital watermark.

[0135] Magnetic Inks: magnetic inks enable areas of a document to be read by a magnetic detector.

[0136] Microprinting: tiny messages can be worked into designs and printed by either intaglio or litho printing processes. With most counterfeiting techniques these tiny messages are lost. The tiny message contains the security code.

[0137] Network Object Authentication System: FIG. 7

[0138] An advantage of the authentication techniques disclosed in the parent patent application is that local authentication is possible because all of the information needed for the authentication of an object is in the object itself. The corresponding drawback is that having all of the information in the object makes it much easier for a counterfeiter who is trying to understand how the authentication information is embedded to do so. This problem can be avoided by having the embedded authentication information include information that is known only to a trusted verification server that is available via a network. The object is digitized if it is not already and the digitized version is sent via a network to the trusted verification server, which retrieves the embedded authentication information from the digitized version and compares it to the authentication information known to the trusted verification server. The trusted verification server then indicates to the source of the object whether the object is authentic or not, as indicated by the results of the comparison.

[0139] FIG. 7 shows a network authentication system 701 that works as just described. The components of network authentication system 701 are connected by network 715, which can be any arrangement that lets the system components communicate with one another. The objects that are authenticated may be either in analog form 703 or in digital form 707. In either case, they have in addition to embedded authentication information 705 a reference number 704 which identifies the object to the trusted verification server. The reference number may be any kind of number, character string, code, or other pattern which can serve as an identifier. The reference number may be represented on the object in any computer-readable fashion and may have other functions as well. For example, it can be a Product Universal Barcode, ID card number, bankcard number, passport number, student ID, social security number, ISBN, and so forth. The reference number may further be represented on the object as a public watermark. A public watermark is a digital watermark that can be read without a key or with a publicly known key.

[0140] Multiple trusted verification servers may co-exist, with each server providing the same or different verification services. For example, a cluster of trusted verification servers may verify credit cards for financial institutions or credit bureaus, while another cluster may verify passports and other government-issued documents. The reference number may serve not only to uniquely identify the document, but also to indicate the cluster of servers it should be routed to. In this case, a universal verification server could be introduced to route the documents to various verification servers according to their reference numbers. Thus, all users are able to verify all authenticated items through a single point of contact (e.g. a web site). Alternatively, each of the verification servers could have the routing list for the reference numbers and could route any document that it could not verify itself to the proper verification server for the document.

[0141] Continuing with the details of system 701, when the object is in analog form, the system works as shown at 702: analog form converter 709 converts analog form 703 to a digitized form 706 and sends it via network 715 to trusted verification server 717. Analog form converter 709 also receives an indication 727 from trusted server 717 whether the analog form is authentic. The flow of information between analog form converter 709 and trusted verification server 717 is indicated by the dotted arrows.

[0142] When the object is in digital form, the system works as shown at 708: digital form 707 is stored in local storage 713 belonging to local system 711. When local system 711 desires to authenticate digital form 707, it uses network 715 to send a copy of the digital form to trusted verification server 717. Local system 711 also receives an indication 727 from trusted server 717 whether digital form 707 is authentic. As before, information flows between local system 711 and trusted verification server 717 are shown by dotted arrows.

[0143] Trusted verification server 717 has two major components: network server 719 and security database 729. Security database 729 contains a key database 731 that relates decryption keys 733 to reference numbers 704 and an authentication information database 735 that relates authentication information 737 to reference numbers 704. Network server 719 handles communications between it and the other components via the network and also includes the components needed to do the actual verification: database interface 721, which is a query interface to database 729, AI reader 723, which can read the authentication information from the part of the object in which it is embedded, for example, from a watermark, and comparator 725, which compares two items of authentication information with one another to determine whether they match. Network server 719 returns the result 727 of the comparison to the source of the object being authenticated.

[0144] Digitized form 706 and digital form 707 are both processed in exactly the same way in trusted verification server 717. Continuing with digitized form 706, network server 719 first reads reference number 704 from the digitized form and uses reference number 704 in DBI 721 to query security database 729 for key 733 and authentication information 737. Then, network server 719 provides digitized form 706 to authentication information reader 723, which uses key 733 returned by the query to read embedded authentication information 705 from digitized form 706. Embedded authentication information 705 and authentication information 737 returned by the query are then provided to comparator 725, which determines whether the two versions of the authentication information match. The result of the comparison is returned at 727 to the source of the object being authenticated. Comparator 725 may of course use any technique for comparison which returns a meaningful result.
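The processing performed in trusted verification server 717 might be sketched as follows. The class and field names are hypothetical, in-memory dictionaries stand in for key database 731 and authentication information database 735, and toy_cipher is an insecure XOR placeholder for whatever keyed watermark reading AI reader 723 actually performs.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Insecure XOR placeholder for keyed embedding and reading (assumption)."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


@dataclass
class DigitizedForm:
    reference_number: str    # reference number 704
    embedded_payload: bytes  # stands in for the watermark holding 705


class VerificationServer:
    def __init__(self) -> None:
        self.key_db: Dict[str, bytes] = {}  # key database 731
        self.ai_db: Dict[str, bytes] = {}   # authentication info database 735

    def register(self, ref: str, key: bytes, auth_info: bytes) -> None:
        self.key_db[ref] = key
        self.ai_db[ref] = auth_info

    def verify(self, form: DigitizedForm) -> bool:
        # Query the security database by reference number.
        key = self.key_db.get(form.reference_number)
        stored = self.ai_db.get(form.reference_number)
        if key is None or stored is None:
            return False
        # AI reader: use the key to read the embedded information.
        embedded = toy_cipher(form.embedded_payload, key)
        # Comparator: here an exact comparison; any meaningful one may be used.
        return hmac.compare_digest(embedded, stored)
```

The boolean returned by verify corresponds to result 727 sent back to the source of the object.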

[0145] Many variations on and refinements of system 701 are of course possible. Reference number 704 may not be part of the object, but may be input by the user as part of the authentication process. Analog form converter 709 may analyze the quality of digitized form 706 before sending it to server 717 and send digitized form 706 only if the analysis indicates that verification server 717 will be able to read reference number 704 and the part of the object in which the authentication information is embedded. If analog form 703 includes security patterns like those discussed above, analog form converter 709 may also check for the proper security patterns before sending digitized form 706 on. If it finds patterns, it can send them to verification server 717 as well for checking in the same manner as described for authentication information 737. Additionally, the security pattern may provide the reference number 704. The authentication information may be derived from the semantic information in the object, as described in the parent of the present application, and verification server 717 may contain only the key needed to locate and/or decrypt the embedded authentication information. There may be a number of levels of encryption. For example, the reference number may be encrypted using a public key belonging to the trusted verification server. Additionally, key 733 may be used either to locate the embedded authentication information or to decrypt it, or both, and another key may be stored in database 729 and used for the other purpose. Embedded authentication information may be hidden in a watermark, or it may be simply contained in a barcode or other visible pattern.

[0146] Applications of Network Authentication System 701

[0147] Authentication of Credit Cards in E-Commerce

[0148] A continuing problem with E-commerce is that the Web merchant has no proof that the person making a credit card purchase on the Internet is in actual possession of the credit card whose number he or she is providing to the Web merchant. Network authentication system 701 can solve this problem. In this application, analog form converter 709 is a PC that includes apparatus such as a Web camera for making an image of the credit card. As part of the Web purchasing procedure, the purchaser can send the image of the card to trusted verification server 717, which can be operated by the credit card company, the credit card provider, or a credit bureau. Server 717 performs authentication as described above and stops the transaction if the authentication fails. In this application, the authentication information that is on the card and stored in the database may include a photograph of the user.

[0149] Authentication of Documents Generally

[0150] There are many situations in which one party needs to authenticate a document received from another. For example, if an employer receives a work permit from a would-be employee, the employer needs to authenticate the work permit. Trusted verification server 717 can be employed generally to solve such problems. In order for server 717 to be useful, the party issuing the document must receive reference number 704 and authentication information 737 from the entity operating server 717 before making analog form 703. When analog form 703 is made, reference number 704 is printed on it, as is a watermark that has authentication information 737 embedded in it. The watermark may be in a discrete part of the document, such as a seal or portrait, or it may be in the entire image of the document. In some embodiments, the verification server may provide the watermark to the issuing party with the authentication information already embedded in it. Anyone who receives analog form 703 can then use system 701 to authenticate it as just described.
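The issuance step, in which the entity operating server 717 gives the issuing party a reference number and authentication information before the analog form is made, might look like the following. This is a sketch under assumptions: the record layout, the function name `issue_credentials`, and the use of random values for the reference number, key, and authentication information are hypothetical; the text does not say how these values are generated.

```python
import secrets


def issue_credentials(database: dict) -> tuple[str, bytes]:
    """Create and store a new record, returning what the issuing party needs.

    database maps reference numbers to {"key": ..., "auth_info": ...} and
    stands in for the security database (hypothetical layout).
    """
    reference_number = secrets.token_hex(8)
    while reference_number in database:  # avoid reusing an existing number
        reference_number = secrets.token_hex(8)
    record = {
        "key": secrets.token_bytes(32),        # stays with the server in this sketch
        "auth_info": secrets.token_bytes(16),  # to be embedded in the watermark
    }
    database[reference_number] = record
    return reference_number, record["auth_info"]


# Usage: the issuer prints the reference number and embeds the authentication information.
security_db: dict = {}
ref, auth = issue_credentials(security_db)
```

In this sketch the key never leaves the server, which fits the embodiment where the server hands the issuing party a watermark with the authentication information already embedded; an issuer that builds its own watermark would also need whatever key the reader later uses.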

[0151] The technique just described can be applied to other identification documents such as passports, immigration papers, and driver's licenses. The major advantage of such online verification is that the reference number links the document with a variety of databases that have been built up over years and are stored on organizational servers. The information in these databases can be used in the process of verifying and tracking the authenticity of the document. Another advantage of the technique is that the devices needed to read the semantic information and the embedded authentication information and to compare the embedded authentication information with either the authentication information computed from the semantic information or with authentication information retrieved from the database are in the server, which substantially lowers the cost of the client devices to which the analog or digital forms are submitted for verification of their authenticity.

CONCLUSION

[0152] The foregoing Detailed Description has disclosed to those skilled in the arts to which the inventions disclosed therein pertain how to make and use the inventions and has also disclosed the best mode presently known to the inventor of practicing the inventions. The principles of the inventions disclosed herein are broad and applicable in many areas. While the inventors have given many specific examples of ways of using and implementing their inventions, it will be immediately apparent to those skilled in the relevant arts that there are many other ways in which the inventions can be used and many other ways of implementing the inventions. With regard to incorporating authentication information in an object, all that is required is that the information used to make the authentication information be in a location in the object which is separate from the location where the authentication information is embedded. With objects that remain digital, the locations need only be non-overlapping parts of the digital representation; with objects that become analog forms which are then made into digital representations, the locations must also not overlap in the analog form.
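For an object that remains digital, the non-overlap requirement can be illustrated with a small sketch. It assumes, purely for illustration, that the digital representation is a byte buffer, that locations are half-open `(start, end)` byte ranges, and that the authentication information is a truncated SHA-256 digest of the source range; none of these choices comes from the text, and all function names are hypothetical.

```python
import hashlib


def regions_overlap(a: tuple, b: tuple) -> bool:
    """True if half-open byte ranges a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def _digest(data: bytearray, source: tuple, width: int) -> bytes:
    return hashlib.sha256(bytes(data[source[0]:source[1]])).digest()[:width]


def embed_digest(data: bytearray, source: tuple, target: tuple) -> None:
    """Write a digest of the source range into the separate target range."""
    width = target[1] - target[0]
    if regions_overlap(source, target):
        raise ValueError("source and target locations must not overlap")
    if not 0 < width <= 32:
        raise ValueError("target must hold between 1 and 32 digest bytes")
    data[target[0]:target[1]] = _digest(data, source, width)


def check_digest(data: bytearray, source: tuple, target: tuple) -> bool:
    """Recompute the digest from the source range and compare with the target."""
    width = target[1] - target[0]
    return bytes(data[target[0]:target[1]]) == _digest(data, source, width)


# Usage: 20 bytes of content followed by 8 reserved bytes for the digest.
doc = bytearray(b"PAY 100 USD TO ALICE" + bytes(8))
embed_digest(doc, source=(0, 20), target=(20, 28))
```

If the two ranges overlapped, writing the digest would change the very bytes it was computed from, so the recomputed value could never match; that is why the sketch rejects overlapping ranges up front.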

[0153] With regard to the verification server, the server can be at any location where it can receive digital representations of objects from sources, including a location belonging to the same system as the source. The reference number can be any kind of identification code and can be used in many ways in addition to its use as a reference number. Further, the server may do other processing in addition to performing verification, and the results of the other processing may be included in the verification process.

[0154] The method for dealing with confusing alphanumeric characters, finally, may be applied in any context where there are recurring patterns that confuse an OCR device.

[0155] For all of the foregoing reasons, the Detailed Description is to be regarded as being in all respects exemplary and not restrictive, and the breadth of the invention disclosed herein is to be determined not from the Detailed Description, but rather from the claims as interpreted with the full breadth permitted by the patent laws.

What is claimed is:
 1. Authenticatable hard-copy commercial paper characterized in that: the hard-copy commercial paper includes an analog form of a digital representation; and the digital representation includes a first part in which semantic information which describes the transaction represented by the commercial paper and which may be read from the analog form is included with non-semantic information and a second part in which a second portion of the digital representation which does not overlap the semantic information represents a semantic digest of the semantic information which may be read from the analog form, whereby the hard-copy commercial paper may be authenticated by reading the semantic information from the analog form, recomputing the semantic digest from the read semantic information, and comparing the recomputed semantic digest with the semantic digest read from the analog form.
 2. The hard-copy commercial paper set forth in claim 1 further characterized in that: the semantic digest is perceptible by a human reader of the analog form.
 3. The hard-copy commercial paper set forth in claim 2 further characterized in that: the perceptible semantic digest includes a value which is produced by encrypting the semantic information.
 4. The hard-copy commercial paper set forth in claim 3 further characterized in that: the value which is produced by encrypting the semantic information contains fewer bits of data than the semantic information.
 5. The hard-copy commercial paper set forth in claim 2 further characterized in that: the semantic digest includes a value which is produced by compressing the semantic information.
 6. The hard-copy commercial paper set forth in claim 1 further characterized in that: the semantic digest is not perceptible by a human reader of the analog form.
 7. The hard-copy commercial paper set forth in claim 6 further characterized in that: the semantic digest is hidden in the non-semantic information.
 8. The hard-copy commercial paper set forth in claim 7 further characterized in that: the semantic digest is an imperceptible watermark in the non-semantic information.
 9. The hard copy commercial paper set forth in claim 8 further characterized in that: the watermark is invisible in the non-semantic information.
 10. The hard copy commercial paper set forth in claim 1 further characterized in that: the commercial paper is a check on a financial institution; and the semantic information includes at least an identification of the financial institution and an amount to be paid.
 11. The hard copy commercial paper set forth in claim 10 further characterized in that: the semantic information further includes an identification of an account from which the financial institution is to pay the check.
 12. An authenticatable digital representation of commercial paper comprising: a first part in which semantic information which describes the transaction represented by the commercial paper is included with non-semantic information; and a second part in which a second portion of the digital representation which does not overlap the semantic information includes a watermark of a semantic digest of the semantic information, whereby the digital representation of commercial paper may be authenticated by reading the semantic information from the analog form, recomputing the semantic digest from the read semantic information, and comparing the recomputed semantic digest with the semantic digest read from the watermark.
 13. The digital representation of commercial paper set forth in claim 12 wherein: the commercial paper is a check on a financial institution; and the semantic information includes at least an identification of the financial institution and an amount to be paid.
 14. The digital representation of commercial paper set forth in claim 13 further characterized in that: the semantic information further includes an identification of an account from which the financial institution is to pay the check.